A Bit-Compatible Shared Memory Parallelization for ILU(k) Preconditioning and a Bit-Compatible Generalization to Distributed Memory
Abstract
ILU(k) is an important preconditioner widely used in many linear algebra solvers for sparse matrices. Until now, there has been no highly scalable parallel ILU(k) algorithm. This paper presents the first such scalable algorithm. For example, the new algorithm achieves a 50-fold speedup with 80 nodes for general sparse matrices of dimension 160,000 that are diagonally dominant. The algorithm assumes that each node has sufficient memory to hold the matrix. The parallelism is task-oriented. We present experimental results for k = 1 and k = 2, which are the most commonly used cases in practical applications. The results are presented for three platforms: a departmental cluster with Gigabit Ethernet, a high-performance cluster using an InfiniBand interconnect, and a simulation of a Grid computation with two or three participating sites.
Related articles
A Bit-Compatible Parallelization for ILU(k) Preconditioning
ILU(k) is a commonly used preconditioner for iterative linear solvers for sparse, non-symmetric systems. It is often preferred for the sake of its stability. We present TPILU(k), the first efficiently parallelized ILU(k) preconditioner that maintains this important stability property. Even better, TPILU(k) preconditioning produces an answer that is bit-compatible with the sequential ILU(k) prec...
A Sub-threshold 9T SRAM Cell with High Write and Read ability with Bit Interleaving Capability
This paper proposes a new sub-threshold, low-power 9T static random-access memory (SRAM) cell compatible with a bit-interleaving structure, in which the effective sizing adjustment of access transistors in write mode is provided by isolating the write and read paths. In the proposed cell, we use a weak inverter to improve write-mode operation. Moreover, applying boosted word line feature ...
The same-source parallel MM5
With the March 1998 release of the Penn State University/NCAR Mesoscale Model (MM5), the official version of the model (MM5v2 Release 8) now runs on distributed memory (DM) message-passing platforms. Under an IBM-funded effort, source translation and runtime library support minimize the impact of parallelization on the original model source code with the result that the majority of code is line...
Parallel Multilevel Block ILU Preconditioning Techniques for Large Sparse Linear Systems
We present a class of parallel preconditioning strategies built on a multilevel block incomplete LU (ILU) factorization technique to solve large sparse linear systems on distributed memory parallel computers. The preconditioners are constructed by using the concept of block independent sets. Two algorithms for constructing block independent sets of a distributed sparse matrix are proposed. We c...
The Data Diffusion Space for Parallel Computing in Clusters
The data diffusion space (DDS) is an all-software shared address space for parallel computing on distributed memory platforms. It is an address space additional to that of each process running a parallel application under the SPMD (Single Program Multiple Data) model. The size of DDS can be up to 2 bytes, either on 32- or on 64-bit architectures. Data laid on DDS diffuses, or migrates and replicates, ...
Publication date: 2008